1 


'!i!' 


ifir 


JOURNAL  OF  GEOPHYSICAL  RESEARCH.  VOL.  93,  NO.  C7,  PAGES  8099-8110,  JULY  15,  1988 


c V 


OIK  FILE  Copy 


Transient  Tracers  as  a  Problem  in  Control  Theory 


Carl  Wunsch 


Center  for  Meteorology  and  Physical  Oceanography,  Department  of  Earth,  Atmospheric,  and  Planetary  Sciences 
Massachusetts  Institute  of  Technology,  Cambridge 


The  problem  of  interpreting  transient  tracer  surveys  in  the  ocean  is  formally  identified  as  correspond¬ 
ing  to  placing  a  “terminal  constraint”  on  a‘“distributed  system  boundary  control  problem.”  The  math¬ 
ematics  available  in  control  theory  can  then  be  brought  to  bear  on  the  tracer  data.  Some  of  control 
theory  is  reviewed  in  the  context  of  a  simple  tracer  example  to  isolate  the  major  issues.  To  use  a  transient 
tracer  to  invert  for  flow  and  mixing  rates  involves  a  two-step  process:  start  with  an  initial  model,  found 
independently,  and  determine  if  acceptable  boundary  conditions  drive  the  model  to  reproduce  the 
interior  transient  tracer  at  the  observation  times.  If  the  model  succeeds  in  that  reproduction,  one  stops; 
the  model  is  adequate  and  need  not  be  changed  Only  if  this  test  fails  does  one  obtain  constraints  on  the 
fluid  flow  and  mixing,  which  can  be  invoked  in  parameter  estimation  techniques  of  control  theory. 
Terminal  constraint  observations  can  also  be  used  to  estimate  the  tracer  concentrations  at  earlier  times 
using  a  smoothing  filter. 
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1.  Introduction 


Direct  inference  of  ocean  circulation  parameters  (flow  and 
mixing  rates)  from  measurements  of  transient  “dyes”  in  the 
ocean  is  not  entirely  straightforward.  In  two  previous  papers 
[Wunsch,  1987,  1988]  (hereafter  referred  to  PI  and  P2,  respec¬ 
tively)  I  began  what  was  intended  to  be  an  exploration  of 
procedures  for  making  direct  inferences  about  the  circulation 
while  simultaneously  evaluating  the  information  content  of 
the  tracer  fields  relative  to  that  of  more  conventional  oceano¬ 
graphic  measurements. 

PI  pointed  out  that  unlike  the  problem  of  steady  tracers, 
one  had  to  distinguish  between  three  distinct  types  of  math¬ 
ematical  system:  (1)  the  forward  problem,  (2)  the  inverse  prob¬ 
lem,  (3)  the  regularization  problem.  The  results  can  be  summa¬ 
rized  by  recognizing  that  most  discussions  of  the  solution  of 
partial  differential  equations  focus  upon  the  conventional, 
well-posed  forward  problem,  in  which  perfect  (in  4he  Cauchy- 
Hadamard  sense)  boundary  conditions  are  used  to  step  a 
system  forward  in  time  and  space.  Almost  all  ocean  models 
are  formulated  in  this  sense.  Unfortunately,  the  proWfem  of 
using  oceanic  observations  with  models  (not  just  tracer 
models)  is  fundamentally  that  of  making  inferences  about  ill- 
posed  systems.  (The  terminology  “ill-posed”  is  somewhat  un¬ 
fortunate,  suggesting  shortcomings  in  the  investigator  who 
works  with  such  systems.  But  well-posed  problems  exist 
mainly  in  textbooks  and  rarely  in  the  universe  of  scientists 
trying  to  compare  models  with  data). 

The  inverse  problem  for  tracers,  steady  or  otherwise,  is 
easily  stated:  given  a  set  of  observations  of  a  tracer  C(x,,  t,) 
distributed  irregularly  and  with  specificable  noise  statistics 
over  an  interior  domain  and  on  the  boundaries,  infer  the  fluid 
flow  parameters.  For  steady  ocean  tracers,  a  decade  of  work 
has  led  to  a  variety  of  procedures  for  making  inferences,  either 
in  a  statistical  sense  (e.g..  maximum  likelihood)  or  in  a  bound¬ 
ing  envelope  sense  (see  for  example,  Wunsch  [1984]  or  Schlit- 
:er  [1987]). 

In  PI  and  P2,  it  was  shown,  however,  that  once  time  is 
admitted  to  the  problem  in  the  shape  of  a  tracer  transient 
(while  still  keeping  the  flow  field  steady),  inferences  about  the 
flow  parameters  had  to  be  deferred  until  the  regularization 


problem  was  solved.  By  regularization  is  meant  the  problem 
of  determining  whether  the  boundary  conditions  governing 
the  evolution  of  the  tracer  field  could  be  determined  from 
interior,  noisy  observations.  As  is  discussed  in  PI,  regu¬ 
larization  is  equivalent  to  solving  a  diffuse  system  upstream 
and  backwards  in  time.  That  such  solutions  are  possible  was 
demonstrated  by  one-dimensional  pipe  flow  models  through 
what  is  called  a  “whole-domain  method,”  in  which  the  unsta¬ 
ble  components  of  the  solution  are  controlled  in  a  global  sense 
by  treating  all  of  space  and  time  simultaneously.  Although  not 
an  inverse  problem,  the  methods  employed  are  similar  to 
those  used  for  inverse  ones. 

In  P2,  application  was  made  to  determining  ventilation 
rates  of  the  eastern  Atlantic  thermocline,  combining  geo- 
strophic  and  vorticity  constraints  with  those  of  helium-3  (3He) 
and  tritium  (3H)  in  a  three-dimensional,  time-dependent 
system.  Regularization  was  accomplished  by  writing  the  miss¬ 
ing  boundary  conditions  as  unknown  coefficients  of  a  bound¬ 
ary  Green’s  function.  The  procedures  used  in  both  papers  were 
all  variants  of  whole  domain  ones  derived  from  inverse  meth¬ 
ods. 

The  great  advantages  of  whole  domain  methods  are  their 
'simplicity  of  concept,  their  power,  and  flexibility.  Their  great 
disadvantage  is  the  rapid  escalation  of  the  computational  load 
that  occurs  in  three  space  dimensions  and  time.  Much  of  the 
discussion  in  P2  concerned  means  to  reduce  the  system  size  to 
one  more  consistent  with  the  available  data. 

At  the  time  of  completion  of  PI,  it  had  already  become 
clear  to  me  that  many  of  the  methods  being  employed  could 
be  unified  under  the  general  umbrella  of  “control  theory.”  The 
purpose  of  this  present  note  is  to  point  out  the  formal  simi¬ 
larity  between  many  control  problems  and  solutions  and  those 
involved  in  understanding  and  using  ocean  tracers.  The  calcu¬ 
lations  are  freed  of  the  many  regional  oceanographic  problems 
of  P2  so  as  to  isolate  the  mathematical  issues,  which  are  very 
general.  This  paper  is  not  a  review  of  control  theory;  that 
subject  is  very  large,  subsuming  major  parts  of  modern  engi¬ 
neering  and  pure  and  applied  mathematics.  Indeed,  one  of  the 
obstacles  to  learning  control  methods  is  the  overwhelming  size 
of  the  literature. 
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Every  practical  tracer  situation  is  different  and  no  brief  dis¬ 
cussion  can  possibly  be  all  inclusive.  So  for  purposes  of  illus- 
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tration,  the  problem  will  be  formulated  as  almost  a  “cartoon" 
tracer  problem,  abstracted  from  the  situation  in  P2,  as  follows. 

A  tracer,  denoted  C,  is  supposed  to  satisfy  a  generic 
advection-dilTusion  equation  of  form 

dC 

—  +  vVC  =  VaVC-  XC  +  Q  (1) 

dt  * 

where  a  is  a  mixing  tensor,  X  represents  a  decay  rate,  and  Q 
represents  interior  sources  and  sinks. 

We  suppose  that  as  with  Freons,  3H,  and  3He,  a  time  his¬ 
tory  of  the  surface  concentration  C(x,  y,  z  =  0,  r)  is  known, 
with  an  error  e(x,  t)  of  specified  mean  square.  Suppose  further, 
that  at  time  if,  an  oceanographer  arrives  on  the  scene  and 
surveys  the  tracer  concentration  within  some  interior  volume 
of  ocean,  obtaining  a  set  of  concentrations 

Q*i,  tf)  =  C(X„  t{)  +  £(X,,  t{)  (2) 

where  e  again  represents  the  errors  (analytical  and  sampling). 
The  survey  time  is  supposed  sufficiently  short  that  we  can 
regard  all  the  observations  as  having  occurred  at  the  single 
time  tf. 

Suppose  at  t  =  0,  the  initial  tracer  distribution  in  the  region 
is  known  C(x,  t  =  0)  =  C0(x).  C0  may  well  be  zero,  with  little 
or  no  uncertainty,  as  with  the  tracers  already  mentioned,  if 
f  =  0  is  in  the  early  1950s  (say),  or  it  might  be  the  result  of  a 
previous  survey  some  years  before,  in  which  case  its  errors 
must  also  be  accounted  for. 

Conventionally,  data  like  those  described  (e.g.,  the  Transient 
Tracers  survey  of  the  North  Atlantic)  have  been  used  as  fol¬ 
lows:  one  takes  a  dynamical  model  (as  Sarmiento  [1983]  took 
the  K.  Bryan  model),  imposes  the  boundary  conditions  at  the 
surface,  and  computes  the  model  distribution  of  tracer 
through  time,  stopping  the  calculation  of  model  time  tf.  The 
calculated  tracer  is  compared  with  the  measured  one,  and  the 
result  is  deemed  acceptable  or  not.  If  the  results  are  sufficiently 
similar,  one  concludes  that  the  tracer  distribution  has  thus 
“verified”  the  model. 

Consider  now  the  more  common  problem  when  there  are 
discrepancies  too  large  to  ignore.  It  is  often  asserted  that  the 
great  power  of  tracers  lies  in  their  integration  of  the  circu¬ 
lation  over  long  distances  and  times.  But  this  integration  is 
simultaneously  their  great  weakness.  Suppose  the  model  allud¬ 
ed  to  were  “perfect,”  having  correct  flows  and  mixing.  How¬ 
ever,  if  there  are  slight  systematic  errors  in  the  surface  bound¬ 
ary  conditions  (e.g.,  that  the  tritium  concentration  in  some 
region  is  estimated  to  be  10%  higher  than  it  really  was  for  10 
years),  then  this  small  systematic  error  will  accumulate  in  the 
model,  in  some  cases  at  regions  far  from  the  initial  difficulty. 
The  final  comparison  between  computation  and  measurement 
may  well  be  a  poor  one,  leading  perhaps  to  the  incorrect 
conclusion  that  the  model  was  in  error.  It  was  this  specific 
concern  about  boundary  conditions  which  motivated  PI.  P2 
showed  that  uncertainty  about  the  tracer  time  history  at  the 
boundaries  leads  to  treating  the  boundary  conditions  as  part 
of  the  problem  unknowns,  rather  than,  as  one  is  taught  and 
teaches,  as  "given.”  The  necessity  of  treating  boundary  con¬ 
ditions  as  part  of  the  system  unknowns  is  what  leads  us  to 
control  procedures. 

Consider  a  bounded  volume  of  ocean.  Equation  (1)  is  now 
supposed  to  govern  a  region  with  v,  a  known  perfectly  and 
completely.  Equation  (1)  can  be  rewritten  as 

(T(x,  t) 

— - - =  d,(C(x,  f).  Q)  (3) 


or,  in  finite  differences,  or  finite  elements,  etc.,  as 

C(t  +  1)  =  AC<()  +  Bu(r)  (4) 

where  the  matrices  A  and  B  arc  constant  with  time  (to  keep 
the  discussion  as  simple  as  possible;  this  assumption  is  not 
necessary).  The  elements  of  the  vector  C(/)  are  the  values  C(x„ 
t )  at  the  grid  points  x,  or  the  finite  element  coefficients  at  time 
(  or  any  other  proper  representation  of  the  “state"  variables  C. 
To  generate  simple  examples,  I  will  use  an  elementary  4x4 
box  model  (Figure  1),  a  reduced  version  of  that  used  in  P2.  Let 
the  net  flux  of  mass  from  box  i  to  box  j  be  written  J f .,  with  the 
resulting  tracer  flux,  then  C,7,v.  The  model  is  thus  represented 
by  a  crude  form  of  upstream  differencing,  but  1  emphasize  that 
much  more  complex  discrete  representations  of  (1)  can  be 
written  in  the  form  (4)  and  the  simplification  to  a  box  model 
does  not  remove  any  of  the  fundamental  mathematical  issues; 
the  appendix  makes  this  assertion  explicit.  The  model  in 
Figure  1  is  intended  to  be  abstract  and  generic  to  focus  on  the 
essential  mathematical  issues.  For  convenience  of  reference, 
the  boxes  numbered  1-4  sometimes  will  be  referred  to  as  the 
"surface"  layer,  which  might  correspond  to  the  ocean  surface 
layer,  with  the  horizontal  coordinate  being  either  latitude  or 
longitude.  Explicit  geographical  identification  is,  however,  nei¬ 
ther  necessary  nor  intended  here.  Similarly,  the  physical  inter¬ 
pretation  of  the  Jtj  need  not  distract  us;  readers  who  wish  to 
pursue  the  question  are  referred  to  P2  or  Keeling  and  Bolin 
[1967], 

Under  these  circumstances,  for  any  interior  box  i,  in  the 
absence  of  sources  or  sinks,  the  time  evolution  is  described  as 

CM  +  1)  =  Qf)  -  AArC.f!)  -  ^  £  C,(t)J0  +  ^  X  Cp)JJt 

y  i  v  i 

where  the  summation  on  j  is  over  all  neighbors  to  box  i,  A t  is 
the  time  step,  and  V  is  the  box  “volume."  We  took  A l/V  =  0.1. 
The  rows  of  A  appearing  in  (4)  sum  to  1  —  because  mass 
conservation  requires  the  Jtj  and  Jjt  entering  and  leaving  a 
box  to  sum  to  zero,  except  in  the  boundary  boxes;  X  has  been 
set  to  zero  here. 

The  term  Bu  represents  the  boundary  conditions  in  the 
shaded  boxes  of  Figure  1 ;  they  are  being  taken  here  as  speci¬ 
fied  values  of  the  time  derivatives  of  the  concentrations  there. 
Unless  otherwise  specified,  all  vectors  are  column  vectors. 
More  generally,  this  term  can  also  represent  any  interior 
sources  or  sinks.  The  separation  of  the  structures  of  B  and  u  is 
somewhat  arbitrary;  they  can  be  chosen  at  the  investigator's 
convenience.  Here  I  have  opted  to  let  B  consist  of  a  matrix 
which  specifies,  through  its  nonzero  elements,  those  boxes  in 
which  boundary  conditions  affect  the  interior  and  put  interior 
Q ,  including  X,  to  zero.  The  vector  u  is  chosen  to  be  a  scalar 
function  of  time  producing  time  histories  in  the  active  boxes 
which  are  identical.  To  be  as  explicit  as  possible,  the  full  A 
matrix  and  B  are  written  out  in  Figure  2. 

2.1.  The  Forward  Problem 

The  forward  problem  is  specify  Bu  (see  Figure  3<i)  and  cal¬ 
culate  forward  in  time  from  an  initial  concentration 
C(f  =  0)  =  0.  After  t f  =  14  time  steps  (define  A t  =  I),  the  dis¬ 
tribution  of  tracer  in  the  boxes  is  as  shown  in  Figure  3 b.  (P2 
discusses  the  size  of  the  time  step  necessary  for  stability.)  For 
illustration,  this  solution  is  deemed  the  reference  case:  the 
fluxes  J{j  are  assumed  to  be  correct,  and  the  tracer  distri¬ 
bution,  at  the  end  of  tf  time  steps  is  taken  as  "truth."  Here  u 
was  the  scalar  n  =  exp  (t),  t  =  0  to  6.  u  =  0.  thereafter  so  that 
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Fig.  1.  Simple  box  model  used  lo  illustrate  control  form  of  transient  tracer  problem.  The  four  corner  boxes  (stippled) 
are  not  involved  in  the  calculation;  boxes  with  partial  shading  are  regions  in  which  formal  boundary  conditions  are 
required.  The  flow  fields  d,  used  in  the  computations  are  displayed  with  the  numerical  values  attached.  Concentrations  in 
boundary  boxes  from  which  there  is  no  flow  into  the  interior  are  physically  irrelevant  (“unobservable"),  e  g.,  box  9. 
Concentrations  in  boxes  8,  12.  14.  and  15  are  specified  as  zero.  Active  interior  boxes  are  6,  7,  10,  and  11. 


all  boundary  boxes  at  the  top  and  left  have  the  same  history. 
B  vanishes  in  the  bottom  and  right  boxes,  fixing  zero  co  :- 
centration  there  (Figure  2  (bottom)). 

A  simple  illustration  of  the  bias  problem  is  obtained  by 
recomputing  in  the  forward  direction  with  the  boundary  con¬ 
centration  time  rate  of  change  in  box  5  artificially  raised  by 
0.5  tracer  units.  At  tf  =  14.  the  biased  forward  calculation 
gives  the  values  displayed  in  Figure  4.  The  error  in  C((y)  is 
caused  solely  by  the  error  in  the  boundary  conditions.  (Use  of 
the  concentration  time  rate  of  change  for  the  boundary  con¬ 
ditions.  rather  than  concentration  itself,  does  exaggerate  the 
effect  of  the  systematic  error.  But  the  present  system  is  being 
computed  for  only  14  time  steps  and  basin  scale  models  inte¬ 
grated  for  thousands  of  time  steps  would  accumulate  poten¬ 
tially  massive  systematic  errors  from  concentration  boundary 
conditions.) 

2.2.  The  Inverse  Problem 

The  oceanographer  arrives  with  his  ship  at  time  (  =  tf  and 
measures  with  some  error  the  concentration  shown  in  the 
boxes  of  Figure  3b,  including  the  boundary  boxes.  He  may 
also  know  the  boundary  concentrations  of  Figure  3a  through 
time  in  the  surface  boxes,  again  with  an  error.  The  inverse 
problem  is  infer  the  Jtj  (or.  equivalently,  the  elements  AiJ  of 
equation  (4)). 

As  is  discussed  in  P2.  unlike  the  steady  problem  this  inverse 
calculation  is  nearly  intractable:  from  one  survey  at  t  =  tf  we 
do  not  know  C(r  +  1)  —  C(()  on  the  time  scale  required  for 
numerical  stability  in  (4),  and  we  do  not  know  the  time  histor¬ 
ies  in  the  left  boundary  boxes  of  Figure  2. 

Use  of  the  values  of  C((/)  to  constrain  the  Jtj  to  make 
improved  estimates  of  them  by  inversion  must  therefore  be 
deferred,  pending  the  outcome  of  the  calculations  described  in 
the  next  section. 

2.3.  The  Control  Problem 

Let  us  restate  the  problem.  Given  the  C„.  the  “terminal 
constraint"  C'Uf )  =  Cd.  the  evolution  equation  (4),  and  any 
restrictions  wc  might  wish  to  place  on  Bu,  determine  if  there 


exist  any  values  of  u,  such  that  the  system  (4)  is  carried  from 
its  initial  conditions  to  the  terminal  conditions  as  observed.  If 
such  a  u  exists  and  it  is  acceptable,  then  the  flow  field  is 
consistent  with  the  transient  tracer  distributions.  If  no  such  u 
can  be  found,  then  the  flow  field  is  not  consistent  with  the 
transient  tracers,  can  be  rejected,  and  hence  modified.  But  we 
do  not  need  to  grapple  with  the  problem  of  determining  a 
modified  flow  field  until  it  has  been  demonstrated  that  the  old 
one  is  inconsistent  with  the  observations. 

An  "acceptable"  set  of  boundary  conditions  would  be  those 
not  in  conflict  with  what  is  known  a  priori  about  them.  At  one 
ievel,  the  boundary  concentrations  may  only  have  to  be  posi¬ 
tive  to  be  acceptable.  Or.  one  may  have  some  measurements 
of  them  to  which  the  calculations  most  conform  within  the 
error  estimate.  We  will  show  below  how  to  explicitly  accom¬ 
modate  numerical  values  of  boundary  data  where  they  are 
availiable.  For  the  moment,  however,  attention  is  confined  to 
the  case  where  they  are  only  required  to  be  physically  realiz¬ 
able  (i.e.,  positive  and  not  infinite  in  value). 

It  is  possible  to  attempt  simultaneously  to  modify  .he  model 
and  the  boundary  conditions  initial  conditions,  but  given  the 
size  and  complexity  of  time  evolving  systems,  there  is  real 
advantage  to  being  able  to  adopt  a  stepwise  approach.  One 
first  asks  the  simpler  question  of  whether  all  discrepancies  can 
be  eliminated  acceptably  through  modification  of  estimates  of 
boundary  data,  before  setting  out  on  the  potentially  long  road 
of  simultaneously  modifying  the  model  too. 

The  tracer  survey  represents  a  terminal  constraint.  It  is  the 
point  to  which  a  complex  system  must  evolve  at  a  given  time. 

An  analogy  is  the  control  problem  of  a  robotic  arm.  At  t  =  0. 
a  robot  arm  is  at  a  known  initial  position.  A.  At  (  =  t f  it  is 
required  that  the  arm  be  in  a  position.  B,  within  some  toler¬ 
ance,  for  grasping  an  object  within  some  tolerance.  Merely 
observing  that  the  arm  was  initially  at  A  at  f  —  0  and  at  B  at  j 

t f  does  not  tell  us  the  trajectory  or  velocity  that  the  arm  had  i 

in  between.  Conventional  control  problems  arc  to  find  a  tra-  j 

jectory  from  A  to  B  that  minimizes  the  energy  required  to  j 

move  the  arm  or  gives  the  smoothest  trajectory,  etc.  The  i 

tracer  problem  consists  instead  of  determining  whether  any  1 

i 

i 
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Fig.  3 a.  Concentration  time  histories  for  a  14  time  step  forward 
integration.  Boundary  boxes  were  driven  by  the  concentration  rates 
of  change  shown  in  Figure  2  resulting  in  the  concentrations  marked 
"boundary."  Time  histories  for  interior  boxes  are  also  depicted  as 
calculated  from  (4). 

physically  acceptable  trajectory  exists  given  that  the  arm  was 
observed  to  have  been  at  A  at  t  =  0  and  at  B  at  t  =  tf. 

The  problem  is  mathematically  interesting  because  the 
tracer  “control”  is  represented  by  the  boundary  conditions, 
and  they  must  be  determined.  Because  the  tracer  system  is 
governed  by  a  partial  differential  equation,  the  problem  is  that 
of  “distributed  system  boundary  control.”  Readers  interested 
in  the  general  mathematical  issues  are  referred  to  Lions  [1971] 
or  Stavroulakis  [1983a,  6]. 

As  the  intention  is  illustrative  here,  we  will  proceed  to  a 
specific  example.  Mimicking  our  hypothetical  oceanographic 
case,  we  seek  boundary  histories  u  such  that  we  minimize  the 
objective  function 

j  =  [C(r  f)  -  cjrG[C(t/»  -cdj  +  z  «Tmo  (5> 

( 

The  matrix  G  is  proportional  to  the  reciprocal  covariance  of 
the  error  in  the  terminal  state  observations  Cd  (equation  (2)). 
For  present  purposes,  it  suffices  to  notice  that  by  varying  G, 
we  can  weight  the  demand  for  reproduction  of  the  terminal 
state  survey  against  the  demand  for  a  minimum  “energy"  u. 


Equation  (5)  is  chosen  as  only  one  example  of  the  general  :iass 
of  L2  norm  objective  functions.  We  are  omitting  the  possibility 
that  any  of  the  elements  of  the  terminal  state  should  be  repro¬ 
duced  exactly,  i.e.,  we  do  not  demand  that 

CM/)  =  Cdi  for  any  i  (6) 

without  error.  Although  the  control  formalism  admits  of  such 
requirements  (see  the  discussion  in  the  work  by  Luenberger 
[1979]),  they  needlessly  complicate  the  mathematics  and  are 
unlikely  ever  to  be  realistic  in  any  case.  The  terminal  state  can 
be  pushed  arbitrarily  close  to  the  observations  by  use  of  G, 
without  encountering  the  degeneracy  involved  in  imposing  (6). 
(The  fundamental  difficulty  is  that  exact  satisfaction  of  the 
terminal  state  often  renders  the  second  term  of  J  irrelevant, 
there  being  only  one,  or  even  no,  possible  values  of  u  produc¬ 
ing  a  fixed  terminal  state.) 

The  second  term  on  the  right  of  (5)  is  only  an  example, 
which  is  most  appropriate  if  there  was  reason  to  seek  the 
smallest  mean  square  injection  consistent  with  the  system  and 
the  terminal  constraint  observations.  No  implication  that  this 
is  necessarily  the  most  useful  such  objective  function  is  intend¬ 
ed,  and  much  more  complex  ones  can  be  used.  In  section  2.4. 
we  will  extend  (5)  to  minimize  the  mean  square  deviation  from 
an  a  priori  estimate  which  differs  from  the  zero  a  priori  value 
implicit  there. 

Solution  of  this  unorthodox  problem  is  usually  obtained 
through  either  the  so-called  Pontryagin  minimum  principle,  or 
dynamic  programming  methods.  For  systems  of  the  present 
type,  the  minimum  principle  appears  to  be  computationally 
more  feasible.  1  will  not  derive  the  discrete  time  optimality 
theorem  (see  Luenberger  [1979],  who  however  refers  to  it  as 
the  "maximum  principle”  having  introduced  a  sign  change). 
The  reader  is  willing,  I  hope,  to  take  the  theorem  on  faith. 
Thacker  and  Long  [1988]  derive  a  version  of  it.  It  can  be 
stated  fairly  concisely,  although  not  in  its  most  general  form 
as  follows.  Let  H  be  a  Hamiltonian  defined  as 

H(C(t),  X(t),  u(t))  =  kr(t)[AC(t)  +  Bu(f)]  +  ur(/)u(r)  (7) 

Let  the  system  satisfy  the  evolution  equation  (4),  subject  to  the 
initial  conditions  C(r  =  0)  =  C0,  and  the  minimum  of  (5)  is 


Fig.  3 b.  Concentrations  at  terminal  lime  t f  -  14 
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Fig.  4 a.  A  biased  calculation  in  which  box  5  had  the  (by  definition 
erroneous)  concentration  shown,  leading  to  the  erroneous  final  state 
shown  in  Figure  4b. 


sought.  Then  there  exists  an  adjoint  system  trajectory  Mt)  such 
that 

MO  =  ArX(/  +  1)  (8) 

subject  to  the  adjoint  terminal  boundary  condition 

Mtf)  =  2G(C({/)  -  C,)  (9) 

and  such  that  the  Hamiltonian  (7)  is  stationary  over  u: 

PH 

—  =  0  (10) 

u 

(This  theorem  has  direct  generalizations  to  nonlinear  systems 
and  continuous  time  ones.) 

Applying  the  theorem  to  the  system  (4),  the  Hamiltonian 
minimum  condition  leads  to 

B'L(r)  =  —  2u(r)  u(f)  =  — (l/2)BrX(r)  (11) 

giving  the  control  in  terms  of  X.  The  X  are  Lagrange  multi¬ 
pliers,  and  perhaps  the  most  important  thing  we  can  observe 
about  them  is  that  they  satisfy  an  evolution  equation  (8),  in¬ 
volving  backwards  running  time,  and  the  adjoint  of  the  matrix 
A.  Such  adjoint  systems  pervade  control  problems  and  have 


been  widely  recognized  [e  g.,  Lewis  and  Derber,  1985;  LeDimel 
ami  Talayrand.  1986;  Thacker  and  Long.  1988]  as  the  key  to 
system  sensitivity.  In  the  present  case,  the  X  represent  the 
systematic,  stable,  backwards  propagation  of  information 
about  the  terminal  constraint  to  the  prior  time  history  of  the 
system.  It  can  be  shown  that  the  Green’s  functions  employed 
in  P2  satisfy  the  adjoint  system  (8).  As  is  usually  the  case  [e.g., 
Schriiter  and  Wunsch.  1986],  the  X  are  readily  interpretable  as 
the  sensitivity  of  the  objective  function  to  perturbations  in  the 
tracer  concentration  C(t)  at  any  lime  t.  i.e., 

AJ(t)  =  -XTU)ACU)  (12) 

To  solve  this  system  (equations  (7)— (9))  and  (11),  we  must 
find  XU)  and  C(t)  such  that  (7)  and  (9)  are  simultaneously 
satisfied.  The  system  is  awkward  because  the  C(t)  evolution 
equation  must  be  solved  forwards  in  time,  and  the  X(t)  equa¬ 
tion  backwards  in  time,  with  the  boundary  conditions  on  C 
applied  at  /  =  0.  and  those  for  X  at  t  =  tf.  This  latter  con¬ 
dition  is  particularly  awkward  because  the  value  of  C(if)  ap¬ 
pearing  in  (9)  is  unknown  until  the  problem  is  solved. 

Proceeding  with  A  constant,  it  follows  from  (8)  that 

Ml)  =  (Ar)t/_,X(t/)  (13) 

and  specifically  that 

X(0)  =  (A T)"Mtr)  =  (A^GrCl/,)  -  CJ  (14) 

We  can  therefore  develop  the  following  sequence  from  (4): 

C(  1  )  =  AC0  -  BBrX(0)/2  =  AC0  —  BBr(  A  r)"G(C(f/)  —  CJ 
C(2)  =  AC1 "  -  BB7  X(  I  )/2  =  A7C0  —  ABBr>  <0)/2  -  BBrX(l)/2 
=  A lC0  —  ABBT(  A '  )'/G(C(r  f)  ~  C„)  -  BB  W)"  ' 1  G(C(t/)  -  C„) 


C(f/)  =  A CUf  -  I)  -  BB^f,  -  1  )/2 

=  A"C0-(A)''  lBBT(Ar)"G(C(f/)  -  C,,)  +  •  •  •  (15) 

Using  (13)  and  (14)  we  see  that  this  last  expression  involves 
only  terms  in  C(t/),  C0.  and  Cd  on  the  right.  Collecting  all  the 
coefficients  of  C(t/)  on  the  left-hand  side,  we  can  solve  ex¬ 
plicitly  for  CUf)  by  a  single  matrix  inversion.  (The  matrix 
dimension  is  that  of  A  Existence  of  the  inverse  of  this  matrix 
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Fig.  4b.  Concentration  resulting  at  tf  when  flow  is  precisely  the  same  as  in  Figures  I  and  3.  For  obvious  reasons,  the 

'‘right**  model  leads  to  an  erroneous  terminal  state. 
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Fig.  5a.  Bottom  panel  shows  concentration  rates  of  change  u  in 
boundary  boxes  2,  3.  and  5  (same  as  the  time  histories  displayed  in 
Figure  3).  Top  panel  displays  the  concentration  rates  of  change  as 
determined  from  control  solutions  of  (15),  (8),  and  (11).  These  differ 
markedly  from  the  “correct"  solution  in  the  lower  panel,  but  drive  the 
model  to  the  observed  terminal  concentrations,  within  slight  per¬ 
missible  errors.  The  top  panel  histories  are  those  with  smallest  mean 
square  with  that  property  and  clearly  demonstrate  the  nonunique 
character  of  the  terminal  constraint  solution.  Box  9  is  a  passive  reser¬ 
voir.  and  hence  the  value  of  Bu  there  is  set  to  zero. 

is  related  to  the  ideas  of  “controllability”  and  “observability” 
of  a  system,  which  we  will  not  take  up  here.  Should  the  matrix 
be  singular,  various  generalized  inverses  can  be  used.) 

With  C (tf)  then  known  in  terms  of  C0  and  Cd,  we  can  now 
go  back  and  calculate  u  explicitly,  as  well  as  C(f).  The  control 
u  is  what  is  known  as  an  "open-loop”  one,  because  it  involves 
only  C(f  =  0)  and  C/t f),  both  of  which  are  external  parame¬ 
ters.  (So-called  closed-loop,  or  feedback  controls,  with 
u<()  x  C(r)  can  be  obtained  from  this  solution.)  For  one  exam¬ 
ple,  we  took  G  =  I  and  C J{tf)  from  Figure  3b.  The  simple  form 
of  B  shown  in  Figure  2  was  retained.  The  solution  that  re¬ 
sulted  gave  a  control  variable  u  as  depicted  in  Figure  5a  and 
an  interior  time  history  as  shown  in  Figure  5b.  The  terminal 
state  (Figure  5c)  found  differs  slightly  from  the  “true”  terminal 
state  of  Figure  3 h  because  we  permitted  a  tradeoff  between 
deviations  from  the  exact  observations  and  minimization  of  u 
over  the  time  history.  When  G  =  101,  the  terminal  state  re¬ 
sulting  (not  shown)  is  closer  to  the  observations,  but  the  mag¬ 
nitude  of  u  has  necessarily  increased.  The  X.  are  displayed  in 
Figure  5 d. 

These  solutions  were  obtained  in  a  two-pass  system:  we 
have  to  solve  the  system  once  completely  to  determine  C(fy) 
and  then  again  to  calculate  C(f).  But  if  the  control  u  is  accept¬ 
able,  we  have  demonstrated  that  our  model  has  passed  its 
consistency  requirements  for  reproducing  the  terminal  state.  If 
no  acceptable  u  can  be  found,  then  we  are  assured  that  no 
error  in  the  boundary  conditions  can  explain  the  failure,  and 
we  must  modify  the  A  matrix  (i.c.,  the  J  tJ  parameters  defining 
this  model).  In  carrying  out  the  calculations  just  described,  a 
lot  of  matrix  multiplications,  but  only  one  matrix  inversion,  of 
a  matrix  of  dimension  equal  to  the  number  of  boxed,  are 
required. 


2.4.  Further  Constraints 

A  not  uncommon  tracer  situation  is  one  in  which  two  or 
more  surveys  are  available  and  the  boundary  conditions  and 
other  information  must  be  consistent  with  both.  (A  restricted 
version  of  this  situation  was  solved  in  P2). 

A  second  form  of  constraints  on  the  boundary  conditions 
are  represented  in  the  problem  worked,  through  supposed 
knowledge  of  the  surface  boundary  concentration  histories; 
i.e..  we  did  not  use  directly  any  knowledge  of  C(t)  in  the  sur¬ 
face  boxes.  This  type  of  information  can  be  accommodated  in 
a  number  of  ways  [e.g.,  Stengel ,  1986],  One  simple  approach  is 
to  modify  the  objective  function  from  (5)  to 

J  =  [C(tf)  -  Cj]TG[C(tf)  -  C,] 

+  £  [u(r)  -  u0(t)]rF[u(f)  -  u0(t)] 

f 

where  u0  represents  any  prior  estimate  of  what  the  control 
ought  to  be  and  F  is  another  reciprocal  covariance  matrix. 
The  calculation  proceeds  as  before,  with  the  second-term  of 
the  Hamiltonian  in  (7)  modified  appropriately. 

2.5.  Time-Dependent  Inversion 

Although  we  will  not  pursue  it  in  detail  at  this  time,  the 
control  formalism  suggests  an  approach  to  model  change  if  it 
must  be  modified  to  bring  it  into  accord  with  the  observa¬ 
tions,  simple  changes  in  the  boundary  conditions  having 
proved  inadequate.  We  only  sketch  the  procedure,  leaving  de¬ 
tails  to  the  future. 

A  standard  control  representation  is  the  so-called  feedback 

form: 

ii  =  —  K(r)C(r)  (16) 

where  the  control  at  time  t  is  chosen  to  depend  explicitly  upon 
the  value  of  the  state  at  that  time.  (The  form  used  in  section 
2.3  is  the  so-called  open  loop  control  representation;  text- 


Fig.  5 h.  Bottom  panel  shows  concentration  time  histories  in  the 
interior  boxes  in  the  reference  calculation  (same  as  depicted  in  Figure 
3<i),  and  the  top  panel  interior  time  histories  when  driven  by  the 
boundary  control  of  upper  panel  of  Figure  5<t. 
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Fig.  5c.  Terminal  state  corresponding  to  Figures  6 a  and  66.  and  differing  from  the  correct  state  of  Figure  36  because 
G  =  1.01  and  the  tradeoff  permitted  between  deviation  from  the  terminal  constraint  and  the  magnitude  of  u. 


books  show  how  to  derive  one  from  the  other.)  As  before, 
some  objective  function  involving  u  is  chosen  to  reflect  the 
needs  of  the  particular  situation.  Suppose  the  boundary  con¬ 
ditions  are  held  fixed,  and  the  control  is  applied  instead  to  the 
model  parameters. 

In  feedback  form,  (4)  becomes 

C(t  +  1)  =  AC(t)  -  BK(t)C(f) 

=  (A  -  BK(r))C(r)  =  A'(t)C(t)  (17) 

absorbing  K  into  A  to  generate  a  new  system  matrix  A'.  If  an 
appropriate  K  has  been  found  which  reproduces  the  required 
terminal  constraint,  one  has  a  modified  model,  given  by  A', 
which  is  consistent  with  the  observed  data. 

This  latter  conclusion  can  be  accepted  only  if  the  structure 
of  A'  is  physically  consistent.  Thus  the  elements  of  A  are 
composed  of  balancing  sums  of  the  Ju  as  discussed  in  section 
2.  To  assure  that  A'  has  elements  J,/  satisfying  mass  conser¬ 
vation  in  each  box  would  require  solving  the  feedback  control 
problem  subject  to  these  additional  constraints.  In  principle, 
these  additional  constraints  can  be  accommodated  by  existing 
control  methods,  but  such  a  model  modification  procedure 
has  not  yet  been  attempted  in  practice  for  the  tracer  problem. 

If  K(t)  is  permitted  to  vary  with  time,  the  new  Jtj'  found 
would  also  be  time-dependent.  One  can  anticipate  that  feed¬ 
back  control  methods  will  eventually  provide  the  key  to  in¬ 
verting  models  with  time-dependent  flow  fields. 

3.  State  Estimation 

3.1.  Recursive  Improvement,  Forward  in  Time 

Control  methods  encompass  a  variety  of  goals  associated 
with  time  evolving  systems  and  their  connection  with  realistic 
observations.  Section  2  was  directed  to  control  per  se,  relating 
missing  boundary  conditions  to  observations  of  a  terminal 
state.  The  calculation  of  the  state  C(t)  given  the  set  of  observa¬ 
tions  (2)  while  simultaneously  improving  prior  estimates  of  the 
boundary  condition  is  also  advantageously  examined  in  the 
control  format  because  it  permits  a  flexible  and  efficient  ap¬ 
proach  to  combining  data  with  observations. 

Several  reasons  exist  for  tackling  the  state  estimation  prob¬ 
lem,  as  distinct  from  the  control  problem  just  described.  The 
nature  of  oceanographic  observations  is  such  that  they  are 


acquired  over  significant  periods  of  time,  often  with  large  tem¬ 
poral  gaps  in  any  given  area.  One  may  have  previously  esti¬ 
mated  the  tracer  distribution  within  the  ocean  up  to  and  in¬ 
cluding  the  time  t ,,  of  some  prior  survey.  With  new  data 
available,  one  seeks  to  make  a  best  estimate  of  the  field  using 
the  new  data,  without  having  to  recompute  all  the  previous 
history  as  well.  The  fundamental  motivation  is  that  the  obser¬ 
vations  obtained  at  tf  >  t,  carry  information  that  ought  to  be 
useful  in  improving  the  estimate  of  the  state  at  f ,  and  earlier. 

Suppose  at  time  t  we  observe  the  tracer  concentrations,  in  i 
boxes  in  the  set  /,  iel.  Define  the  square  matrix  E  as  having 
dimension  equal  to  the  total  number  of  boxes,  and  let  all  its 
elements  vanish  except  for  unity  along  the  diagonal  in  row  or 
column  i.  Then  the  observations  z (t)  at  time  t  can  be  written 


z(i)  =  EC(!)  +  n(f) 


(18) 


Fig.  5 d.  Adjoint  solution  (Lagrange  multipliers)  for  problem  dis¬ 
played  in  Figures  Sa  Sc.  Not  surprisingly,  the  most  important  con¬ 
straints  are  the  evolution  equations  at  times  near  the  terminal  time 
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Fig.  6.  (Left)  Correct  concentration  values  for  boundary  (top)  and  interior  boxes  (bottom).  (Right)  Kalman  filter 
estimate,  with  observations  of  the  interior  made  available  at  times  5,  9.  and  12.  Initial  boundary  concentrations  were  set  to 
a  value  of  2,  (correct  value  is  0);  initial  interior  concentrations  were  set  to  0.  but  with  a  large  error  estimate.  Note 
convergence  to  correct  values  as  more  data  are  acquired. 


where  n  represents  the  observational  noise  whose  mean  is  as¬ 
sumed  0  and  whose  covariance  is  R  =  <nnT>. 

Suppose  that  an  estimate  of  C,  called  C(0),  is  known  at 
some  initial  time  e  -  0  (it  might  be  zero)  and  with  an  error 
covariance 

P(0)  =  <(C(0)  -  C(0)XC(0)  -  C(0))r> 

An  estimate  t(l)  of  C  one  time  step  in  the  future  is  com¬ 
putable  from  (4)  and  is 

Cd.  -)  =  AC(0)  +  Bu(0)  (19) 

(the  reason  for  the  minus  sign  will  become  clear).  If  the  initial 
state  estimate  is  uncertain,  then  surely  the  calculation  (19)  is 
also  uncertain,  it  can  be  shown  without  difficulty  [Brogan, 
1982]  that  the  uncertainty  of  t(l,  — )  is 

P(l,  -)  =  AP(0)Ar  +  0  (20) 

Q  has  been  introduced  to  represent  the  covariance  of  any 
unobservable  or  unpredictable  contribution  to  the  driving 
term  in  (d»  This  noise  process  is  assumed  to  have  zero  mean. 

If  some  observations  z(l)  become  available  at  t  —  1,  they 
will  in  general  differ  from  the  initial  estimate  (19).  If  the  differ¬ 
ence  between  estimate  and  observation  lies  outside  the  esti¬ 
mated  uncertainty  of  both  z(l),  C(l),  it  seems  reasonable  that 
we  should  be  able  to  combine  the  measurement  with  the  ini¬ 
tial  estimate,  with  due  regard  for  the  relative  errors  of  both, 
into  a  better  estimate  of  C(l)  than  is  represented  by  either 
alone.  We  therefore  demand  an  improved  estimate,  following 
the  initial  calculation  (19)  and  the  observation  in  the  form 

C(l.  +)  =  C(1,  -)  +  K[z<l)-  EC(1,  -)]  (21) 

where  the  "gain  matrix"  K  must  be  determined.  Structurally, 
the  logic  is  simple.  In  the  absence  of  any  observation,  the  best 
estimate  we  can  make  should  reduce  to  the  calculated  one,  i.e., 
K  =  0;  if  the  difference  between  observation  and  calculation  is 
large,  and  if.  for  example,  the  observational  noise  is  much  less 
than  the  uncertainty  expressed  in  (20),  then  K  becomes  the 


generalized  inverse  of  E;  i.e.,  the  observations  replace  the  cal¬ 
culations.  Thus  the  minus  sign  in  the  argument  denotes  an 
estimate  made  from  the  evolution  equation  alone,  and  the  plus 
sign  the  modified  estimate  made  after  the  observations  are 
used. 

A  general  expression  for  K  can  be  derived  that  minimizes 
the  mean  square  error  of  the  estimate.  It  is  [e.g.,  Liebelt,  1967] 

K  =  P(l,  —  )Er(0)[E(0)P(I,  -)E’(O)  f  R]"1  (22) 

(The  behavior  we  anticipated  K  would  have  can  be  confirmed 
by  examination  of  this  expression).  The  expected  error  of  this 
new  estimate  (21)  can  be  shown  to  be 

P<1,  +)=  [P(l.  -)'1  +  ET(0)R"'E(0)]-' 

=  P(1,  -)-KEP(l,  -)  (23) 

Readers  may  recognize  the  forms  (20H23)  as  a  Kalman 
filter  operation.  M.  Ghil  [e.g.,  Ghil  et  al.,  1981]  has  pioneered 
this  approach  to  meteorological  updating  and  forecasting  and 
a  small  oceanographic  literature  has  followed  that  path  [e.g., 
Brammer  et  al,  1983;  Miller,  1986].  The  new  estimate  (19) 
then  replaces  C(0)  and  time  evolves,  leading  to  the  recursion 

C(r  +  1,  -)  =  Atff,  +)  +  Buff)  (24a) 

Pff  +  1,  — )  =  APff,  +)Ar  +  0  (246) 

Kff)  =  Pff  +  1.  -  )Erff) 

•  [Eff)P(f  +  I,  —  )Eff)r+  Rff)]"’  (24c) 

Pff  +!.+)=  [Pff  +  1.  -)-‘  +  E(t)rR ~ 'ff)E(t)] "  1 

=  Pff  +  1.  -)  -  Kff)Eff)Pff  +  1.  -)  (24 d) 

C(t  +  !,+)  =  C(t  +  l,  -) 

-  K[zff  +  1)  -  EC(r  +  1,  -)]  (24c) 

At  time  steps  with  no  observations,  K  vanishes,  and  we  simply 
continue  without  it. 

Figure  6  shows  how  such  a  calculation  could  work  in  prac- 
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Fig.  7.  Smoothed  boundary  (top)  and  interior  (bottom)  values.  Com¬ 
pare  to  Figure  6. 

tice.  The  initial  concentrations  in  the  boundary  boxes  were 
(deliberately  erroneously)  set  to  C  =  2  (the  correct  value  being 
zero).  The  initial  interior  concentrations  were  correctly  set  to 
zero.  Figure  6(left)  compares  the  correct  interior  con¬ 
centrations  through  time  with  those  calculated  using  the  ini¬ 
tially  incorect  boundary  data.  For  the  first  four  time  steps,  the 
interior  estimates  diverge  rapidly  from  the  correct  values  be¬ 
cause  the  erroneously  high  boundary  estimates  rapidly  dye  the 
interior  boxes. 

At  time  f  =  5,  observations  are  introduced  in  the  interior 
boxes  only,  with  an  estimated  error  covariance  of  R  =  1.01. 
This  set  of  noisy  observations  drives  the  estimated  interior 
observations  toward  the  correct  values.  Another  set  of  obser¬ 
vations  is  introduced  at  time  step  9,  and  again  at  time  step  12. 
As  more  observations  are  included  the  system  state  estimate  is 
gradually  converging  toward  the  correct  values. 

For  purposes  of  this  illustration,  the  error  in  all  the  initial 
boundary  concentrations  was  taken  to  have  a  variance  of  5,  in 
boxes  =  2,  3,  and  5  and  Q  =  0.1 1.  In  this  instance,  Q  repre¬ 
sents  the  error  owing  to  the  failure  to  specify  the  values  of  Bu. 

Even  though  the  assumption  was  made  that  no  subsequent 
boundary  concentration  observations  were  available  following 
the  initial  estimate.  Figure  6  shows  that  the  interior  observa¬ 
tions  are  able  to  improve  the  estimates  of  the  boundary  con¬ 
centrations  too.  simply  because  boundary  values  and  interior 
observations  are  connected  through  the  system  evolution 
matrix  A. 

(All  estimates  shown  are  accompanied  by  a  complete  error 
covariance  matrix  which  has  not  been  displayed  to  keep  the 
figures  uncluttered.) 

3.2.  Smoothing 

Meteorological  oceanographic  practice  until  recently  has  fo¬ 
cussed  on  the  forecasting  calculation,  represented  by  (24a).  We 
now  part  company  with  that  emphasis  by  noting  the  evolving 


and  accumulating  observations  carry  information  about  the 
tracer  concentrations  prior  to  the  time  of  observation.  How 
can  an  observation  of  C (t/)  be  used  to  help  improve  an  esti¬ 
mate  of  C(f),  where  t  <  tf°!  (We  previously  employed  C(r/)  to 
help  estimate  u  at  prior  times;  the  present  problem  seeks  in¬ 
stead  to  use  the  available  information  to  estimate  C(t)  both  on 
boundaries  and  in  the  interior.)  The  question  leads  to  the 
Kalman  “smoother”;  we  follow  Liehelt  [1967,  p.  198]  and 
Meditch  [1973]  and  use  the  Rauch  et  al.  [1965]  algorithm. 
Recent  meteorological  work  [e.g.,  LeDimet  and  Talagrand, 
1986]  directed  at  improving  estimates  of  the  initial  state  of  the 
atmosphere  from  subsequent  observations  are  a  similar  appli¬ 
cation  of  the  smoothing  method. 

Derivation  of  the  algorithm  is  lengthy  (more  so  than  for  the 
Kalman  filter)  and  is  not  described  here.  We  suppose  that  we 
have  retained  all  prior  estimates  t  and  P  from  the  forward  in 
time  calculation  (22).  Then  using  the  plus  sign  to  denote  the 
improved  estimate,  the  Kalman  smoothing  calculation  is 


e+(f)  =  C(t)  +  J(rut:  +  (t  +  1)  -  A C(t))  (25a) 

P+(r)  =  P(f,  +  )  +  J(Pf(r  +  1)  -  P(f  +  1.  -))JT  (25 h) 

J(r)  =  P(t,  +)ArP(t  +  1,  -)"  1  (25c) 

P(t  +  1,  -)  =  AP(i,  +)AT  +  Q  (25d) 


(J  should  not  be  confused  with  the  objective  function  J  or  the 
flow  parameters  J ij).  Notice  particularly  the  appearance  of  the 
transpose  of  A  in  (25c),  which  should  be  no  surprise  in  view  of 
its  previous  appearance  as  the  fundamental  quantity  for  carry¬ 
ing  information  backwards  in  time.  The  essence  of  the  calcula¬ 
tions  is  a  comparison  of  the  prior  estimate  of  C(r)  with  the 
value  computed  backwards  from  later  observations  and  a 
modification  to  the  prior  estimate  as  an  appropriately  weight¬ 
ed  average  of  the  two  values.  The  observations  themselves  do 
not  appear  in  (25)  because  the  information  they  carry  has 
already  been  extracted  in  making  the  Kalman  filter  estimate. 

All  information  has  now  been  exhausted.  There  is  nothing 
further  to  be  gained  by  another  calculation  in  the  forward 
direction:  the  estimates  would  not  change  unless  more  obser¬ 
vations  became  available. 

The  Kalman  smoother  (25)  was  used  to  reestimate  the 
values  of  C.  The  result  is  shown  in  Figure  7.  Notice  the  great 
improvement  in  the  values  at  times  when  no  observations 
were  previously  available,  including  the  much  improved  esti¬ 
mate  of  the  incorrect  initial  state.  Comparison  of  the 
smoothed  estimate  in  the  interior  shows  that  the  backward 
propagation  of  the  future  observations  has  generally  improved 
the  estimates  compared  to  those  from  the  Kalman  filter  esti¬ 
mates.  At  early  times  prior  to  the  first  observations  at  r  =  5. 
the  smoothed  interior  estimates  diverge  from  the  true  state, 
going  unphysically  negative.  The  apparent  reason  for  this  be¬ 
havior  was  the  use  of  a  large  initial  error  estimate,  for  the 
interior  initialization  at  t  =  0.  When  the  initial  interior  error 
estimate  was  made  much  smaller,  the  system  drove  the  esti¬ 
mates  of  the  initial  boundary  concentrations  much  closer  to 
zero,  rather  than  permitting  them  to  asymptote  to  nonzero 
values,  and  the  interior  initial  values  of  0  were  much  more 
closely  recovered  (this  case  is  not  shown  here). 

The  error  estimates  for  the  interior  values  at  t  =  0  in  Figure 
7  are  %  ±  1.5  and  thus  within  the  formal  errors,  are  unphysical 
negative  values  are  indistinguishable  from  zero.  It  is  also  clear 
that  imposition  of  positivity  constraints  on  the  system  would 
be  helpful  additional  information 
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4.  Some  General  Comments 

The  Kalman  filter  has  been  much  discussed  and  used  since 
its  introduction  in  1960  (see  the  history  in  Sorenson  [1985]). 
Although  sometimes  appearing  extremely  mysterious,  the 
basic  ideas  it  embodies  are  straightforward,  and  our  own  dis¬ 
cussion  here  has  not  introduced  anything  original  into  the 
subject.  The  purpose  of  its  discussion  here  is  twofold:  the 
application  of  the  filter  method  into  an  advection-difTusion 
system  does  seem  to  have  some  practical  result,  and  more 
important,  the  control  ideas  described  above,  and  the  Kalman 
state  estimation  are  so-called  dual  problems  of  each  other  [see 
Stengel,  1986].  That  is,  somewhat  surprisingly,  the  control 
problem  and  the  state  estimation  problem,  both  of  which  rely 
upon  a  state  estimate,  are  mathematically  equivalent. 

The  numerical  operations  demanded  by  the  algorithm  dis¬ 
cussed  here  are  easy  to  implement  for  small-scale  systems  (all 
the  examples  displayed  were  generated  on  an  IBM  PC-XT 
using  the  software  system  entitled  PC-MATLAB,  for  which 
matrix  operations  are  both  easy  to  code,  and  very  fast).  For 
basin  scale  general  circulation  models,  the  number  of  degrees 
of  freedom  grows  rapidly  to  the  point  that  mere  storage  of  the 
arrays  becomes  a  major  problem.  (In  the  alternative  dynamic 
programming  approach  to  control,  the  size  issue  is  colorfully 
known  by  the  mathematician  Richard  Bellman’s  label  as  “the 
c  ~  of  dimensionality”.) 

K  paid  almost  no  attention  here  to  the  question  of  the 
extc.  to  which  the  implementation  of  equations  such  as  (25) 
remains  feasible  for  ocean  basin  models  [G/ti/  el  al.,  1981, 
Thacker  and  Long  [1988]  and  others  however  discuss  the 
question  for  equation  (24)  at  considerable  length).  There  are 
two  distinct  reasons  for  this  neglect.  First,  it  is  useful  to  sepa¬ 
rate  understanding  of  ideal  solutions  from  questions  of  practi¬ 
cal  implementation.  The  intellectual  demands  of  the  methods 
described  are  sufficiently  great  that  distractions  of  numerical 
efficiency  are  best  delayed.  Second,  the  rate  of  improvement 
both  in  computing  power  and  numerical  algorithms  remains 
so  rapid,  that  computations  infeasible  today  may  well  become 
almost  trivial  in  10  years.  Thus  it  seems  unnecessary  to  be 
concerned  overly  much  at  this  stage  with  detailed  questions  of 
precisely  what  size  problems  are  feasible  or  not. 

Nothing  has  been  said  here  about  direct  use  of  the  continu¬ 
ous  time  formulation  of  the  control  problem;  we  went  im¬ 
mediately  from  (1)  to  (4).  As  with  inverse  methods,  confusion 
has  been  engendered  by  claims  that  continuous  time  methods 
(variational  principles)  have  some  intrinsic  advantage  over  dis¬ 
crete  methods.  To  my  knowledge,  there  are  no  practical  prob¬ 
lems  in  which  one  cannot  formulate  the  problem  wholly  as  a 
discrete  one.  There  may  be  difficult  questions  of  what  consti¬ 
tutes  an  adequate  discretization,  but  that  problem  is  separable 
from  the  question  of  whether  or  not  the  fundamental  practical 
questions  can  be  answered,  given  that  a  sensible  discretization 
has  been  carried  out.  There  is  little  doubt  that  the  continuous 
time  formulations  are  mathematically  elegant  (lying  largely  in 
the  realm  of  functional  analysis)  and  interesting  (see  Lions 
[1971]  and  the  collection  of  papers  edited  by  Stavroulakis 
[1983]).  But  the  intuitive  advantage  of  working  in  finite  di¬ 
mensional  vector  spaces,  and  the  consequent  ability  to  reduce 
most  problems  to  some  form  of  least  squares  should  not  be 
minimized.  Lanczos  [1961]  makes  particularly  clear  the  con¬ 
nection  between  the  continuous  and  discrete  problems.  We 
can  also  point  directly  at  the  control  literature  itself,  wherein 
many  textbooks  develop  the  continuous  and  discrete  control 
and  estimation  problems  in  parallel,  with  experience  suggest¬ 
ing  no  intrinsic  advantage  of  one  approach  over  the  other. 


Appendix 

Wc  justify  the  assertion  made  in  the  text  that  a  finite  differ¬ 
ence  formulation  of  the  tracer  problem  can  be  put  in  the 
canonical  form  of  (4).  For  example,  if  wc  discretize  the  Lapla- 
ctan  with  x  a  scalar,  u  and  r  constant,  and  using  upwind 
differencing  following  Roadie  [1976],  we  have  a  simple  ex¬ 
plicit  scheme: 

C,r  1  -  cy  +  RiuCd  -  uC,  ,  ')  +  RU'Cj/  -  rC(J  ,*» 

- rf{cy ,./  +  c,  ,./ -  2cy  +  ictJ. /  +  c,4  -  2cyii 

=  0  (Al) 

R  =  17  A/,  Ax  d  =  xAt/Ax2 

V  being  a  velocity  scale  and  the  other  variables  are  conven¬ 
tional.  (Al)  is,  in  turn,  of  the  form  of  the  homogeneous  version 
of  (4),  the  boundary  data  being  added  in  an  obvious  way.  An 
implicit  scheme  would  require  an  extra  matrix  inversion  to 
reduce  to  (4). 
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